Semantic Annotations for Biology: a Corpus Development Initiative at the Jena University Language & Information Engineering (JULIE) Lab

Authors

  • Udo Hahn
  • Elena Beisswanger
  • Ekaterina Buyko
  • Michael Poprat
  • Katrin Tomanek
  • Joachim Wermter
Abstract

We provide an overview of corpus-building efforts at the Jena University Language & Information Engineering (JULIE) Lab, which focus on life science documents. Special emphasis is placed on semantic annotations covering a large number of biomedical named entities (almost 100 entity types), semantic relations, and discourse phenomena, reference relations in particular.


Related resources

The CALBC Silver Standard Corpus - Harmonizing multiple semantic annotations in a large biomedical corpus

The CALBC initiative aims to provide a large-scale biomedical text corpus that contains semantic annotations for tagged named entities of different kinds. The generation of this corpus requires that the annotations from different automatic annotation systems are harmonized. In the first phase, the annotation systems from 5 participants (EMBL-EBI, EMC Rotterdam, NLM, JULIE Lab Jena, and Linguama...


BioTop and ChemTop - Top-Domain Ontologies for Biology and Chemistry

Holger Stenzhorn, Stefan Schulz (University Medical Center Freiburg, Institute for Medical Biometry and Medical Informatics, Stefan-Meier-Straße 26, 79104 Freiburg, Germany); Elena Beißwanger, Udo Hahn (University Language and Information Engineering (JULIE) Lab, Fürstengraben 30, 07743 Jena, Germany) ...


The GeneReg Corpus for Gene Expression Regulation Events - An Overview of the Corpus and its In-Domain and Out-of-Domain Interoperability

Despite the large variety of corpora in the biomedical domain their annotations differ in many respects, e.g., the coverage of different, highly specialized knowledge domains, varying degrees of granularity of the targeted relations, the specificity of linguistic grounding of relations and named entities referred to in the documents, etc. We here introduce GENEREG (Gene Regulation Corpus), the ...


The JULIE LAB MANTRA System for the CLEF-ER 2013 Challenge

We here describe the set-up for the system from the Jena University Language & Information Engineering (JULIE) Lab which participated in the CLEF-ER 2013 Challenge. The task of this challenge was to identify hitherto unknown translation equivalents for biomedical terms from several parallel text corpora. The languages being covered are English, German, French, Spanish and Dutch. Our translation...


Towards Enhanced Interoperability for Large HLT Systems: UIMA for NLP

We introduce JCORE, a full-fledged UIMA-compliant component repository for complex text analytics developed at the Jena University Language & Information Engineering (JULIE) Lab. JCORE is based on a comprehensive type system and a variety of document readers, analysis engines, and CAS consumers. We survey these components and then turn to a discussion of lessons we learnt, with particular empha...




Publication date: 2008